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We present an analytical treatment of a genetic switch model consisting of two mutually inhibiting 
genes operating without cooperative binding of the corresponding transcription factors. Previous 
studies have numerically shown that these systems can exhibit bimodal dynamics without possessing 
two stable fixed points at the deterministic level. We analytically show that bimodality is induced by 
the noise and find the critical repression strength that controls a transition between the bimodal and 
non-bimodal regimes. We also identify characteristic polynomial scaling laws of the mean switching 
time between bimodal states. These results, independent of the model under study, reveal essential 
differences between these systems and systems with cooperative binding, where there is no critical 
threshold for bimodality and the mean switching time scales exponentially with the system size. 
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Gene expression in living cells is regulated by transcription factors that bind to specific DNA sequences thereby 
promoting or repressing transcription of genes. This mechanism allows for a “digital” response: when a cell has to 
make a decision, between expressing a certain protein, A, or another, B, a biochemical regulatory network leads the 
system to a state either dominated by A, or B. Such behavior is called bimodal. An example of such decision-making 
circuits is given by the genetic toggle switch in which two transcription factors mutually repress each other mm- 
This and other genetic switches allow cells to switch between distinct phenotypic states and determine the cell’s fate, 
in response to environmental stimuli and/or internal signals [3HS]- 

Genetic switches are found to exhibit distinct behaviors according to whether or not there is cooperative binding 
(CB) of transcription factors (see e.g. [7] in the context of positive feedback). If GB is in play, more than a single 
transcription factor molecule can bind to the DNA sequence, and the binding probability depends on whether there 
are molecules already bound to the sequence. CB is a driver of bimodality and was previously thought to be a 
necessary condition for a bimodal behavior [HHII]. This is since when CB is present in the rate equations, there are 
(at least) two stable fixed points corresponding to states rich in each type of transcription factors; in contrast, the 
absence of CB yields a single stable fixed point where the two transcription factors coexist. 

Yet, in recent years, it has been shown in different models theoretically [HHni, and experimentally [7], that 
bimodality can emerge even without having bistability at the deterministic level. In [7], bimodality has been reported 
in a synthetic budding yeast system, which concluded that the bimodal behavior is induced by demographic noise. In 
Refs. [131 m], the authors have numerically shown that a genetic toggle switch can exhibit a bimodal behavior due 
to demographic noise, even in the absence of CB. To this end, in Ref. m the exclusive switch model (ESM) was 
analytically studied via the probability generating function. Yet, their analysis, valid only in limiting cases, cannot 
uncover how demographic noise gives rise to bimodal dynamics. Thus, the mechanism of noise-induced bimodality in 
such systems without CB remains unclear. 

In this Letter we present an analytical treatment of the ESM, see Eig. which is found, e.g., as a coarse-grained 
description of the lysis-lysogeny switch of phage A mm- We begin by analyzing the case of equal degradation rates 
of the transcription factors. We show that bimodality is driven by multiplicative noise, thus the bimodal states 
correspond to states for which the noise in the system vanishes. We further find a transition between the bimodal 
and non-bimodal regimes controlled by the noise strength, and identify the onset of bimodality as function of the 
repressor strength. Finally, we show that the mean switching time (MST) from a state rich in A to a state rich in B 
scales polynomially in the system size, unlike typically found in bistable systems. These claims are then generalized 
to the case of different degradation rates using an adiabatic approximation. Finally, we show that our results hold 
for other models displaying noise-induced bimodality such as the general toggle switch uniiiij. Our analysis is also 
available in the Supplemental Material (SM) and Mathematica files. 

The genetic toggle switch models mutual inhibition and degradation of transcription factors. In the case of ESM, 
there is an overlap between the promoters of A and B preventing simultaneous occupation of the two [ini[i3] , see 
Fig.[T] Thus, at the deterministic level, the dynamics of the free proteins A and B, and the bound proteins, ta and 
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FIG. 1: Top: a schematic plot of the ESM M- The repressors A and B cannot be bound simultaneously due to overlap between 
their promoter sites. Middle and bottom: the difference and sum of the copy numbers of A and B obtained from stochastic 
simulations m, with a = 0.01, A: = 10 and g ~ I- 


rB, satisfy the following set of equations [14] 

hi = gAil-rs) - dATii-Koni{l-rA-rB) + kita 
n2 = gsi^-TA) - dBn2-Kon2{l- ta-tb) + hitb 
TA = Atoni(l --CA - rs) - KirA 

tb = Ko«2(l --ta - - Kirs. (1) 

Here ni and n 2 denote the copy-numbers of proteins A and B, respectively. Also, gA and gB are the maximal 
production rates of proteins A and B, and c^a and ds, the corresponding degradation rates. In addition, the bound 
repressors ta and r^, 0 < ta^tb < 1, are bound A and B proteins that monitor the production of B and A, 
respectively, kq denotes the binding rate of proteins to the promoter while ki is the dissociation rate. 

For simplicity we will henceforth assume gA = gs = g- In the limit of dA,dB ^ ki, the relaxation of the bound 
proteins is fast compared to that of the free proteins. As a result, in this limit, one can adiabatically eliminate the 
fast variables ta and rs and arrive at a set of two Michaelis-Menten-like rate equations for ni and n 2 |14j : 

hi =/i(ni,n2) - aini , ri2 = f2{ni,n2) - a2n2, (2) 

where fi{ni,n 2 ) = (1 -I- kni)/{l + kni kn 2 ). Here we have defined the dimensionless repression strength k = kq/^i 
as the ratio of the binding and unbinding rates, ai = dA/g and a 2 = dB/g are the rescaled degradation rates of A 
and B, and we have rescaled time t —>■ gt. We will further assume that ai = a 2 = ct, which will be generalized later 
on. 

In this paper we focus on the strong repression limit, krii 2^ 1 (z = 1,2) [2], which is found {e.g.) in a bacterial 
genetic switch [5]. Since at the fixed point of system ([^ ^ a~^, see below, the strong repression limit becomes 

e = a/k 1, and one can naturally define the concentrations of A and B hy xi = ani, X 2 = an 2 , respectively. 
The scaling of the fixed points allows us to introduce the effective system size a~^. Yet, while a~^ is proportional 
to the physical system size N originating from system Q, they are not identical. In the SM we discuss in detail the 
relationship between our rescaled parameters and the physical system size, and we also comment about the biological 
relevance of our approximations. Finally, note that at the fixed point, rii = — {1 + s)/{2a), see SM, indicating 

that, in the deterministic limit, the system converges into an equal state of A’s and B's. 

To account for demographic stochasticity ignored by Eqs. ([^, we can write down the corresponding master equation 
for the probability Pni,n 2 to find ni and n 2 molecules of type A and B, respectively. Defining the step operator 
E^F{n) = F{n ± 1), we have (see SM): 

Pni,n2 = [{E~^-Plfi{ni,n2) + {F~^-l)f2{ni,n2) 

-I- ai{F^^ - l)ni -I- a 2 {F^^ - 1 ) 712 ] Pni,n 2 


(3) 
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Using the Gillespie algorithm |16j . stochastic system (|^ is simulated and shown to exhibit bimodality in some range 
of parameters (middle panel in Fig. [^, in sharp contrast with the deterministic dynamics ([^ [I^ 114) . 

To this end, we introduce two auxiliary variables: the total concentration, w = xi + X 2 , and the (adimensional) 
concentration difference u = (x\ — x- 2 )l (xi +X 2 ). Note that u « ±1 when the system is rich in one type of transcription 
factor, whereas u « 0 at the deterministic fixed point. For strong repression, e <gC 1, the joint stationary probability 
density function (PDF), Vs{u,w), decouples and satisfies Vs{u,w) = Ps{u)Rs{w) (see SM). Here 

Rs{w) = {2TTa)~^^‘^e~^ " , (4) 

indicating that the sum of A’s and B’s, represented by w, is approximately conserved. To find Ps{u), we consider its 
Langevin equation (see SM) 


du/di = —u + y/k\/l — v?"q{i), ( 5 ) 

where i = 2geat = 2ga^t/k, t is the physical time used in Q, and r]{t) denotes normalized Gaussian white noise. 

Equation ([^ captures the stochastic dynamics of the system. It has already been treated in previous works [HI [11], 
and suggests an explanation for the occurrence of bimodality in the genetic toggle switch. The deterministic drag, 
—u, attracts the system to the stable fixed point, u* = 0, but since at this state the noise has maximum strength, Vk, 
the value of u is driven away, toward those states at which the noise vanishes, u = ±1. These are the bimodal states 
and replace the deterministic fixed points in the GB case. How does this result depend on the repressor strength fc? 
Our previous argument has assumed that the noise strength at fixed point is large enough to oppose the deterministic 
drag. Yet, taking /c —>■ 0, yields ii = —u, and thus u{t) —>■ 0 as t —)■ 00 . We can thus expect that for small fc’s, the 
system fluctuates around u = 0 without exhibiting bimodality. This transition from unimodality to bimodality is 
elucidated by the stationary PDF, Ps{u), of Eq. ([^ [H]. We find 

P,{u)=M{1-uY~''^^\ ( 6 ) 

where J\f = T {k~^ + 1/2) /[\/ttT (^~^)] is a normalization constant such that Ps(u)du = 1. Defining the critical 
repressor strength, fee = 1 (where the PDE concavity is changed), we find two distinct regimes: non bimodal, k < kc, 
where the system displays Gaussian fluctuations around the fixed point u* = 0, and bimodal, k > kc, where the system 
exhibits bimodality and switches between the states u = ±1. In Eig. Eq. (|^ excellently agrees with simulations for 
different values of k. Finally, that PDF § satisfies |Ps(m + Q:) — Ps(m)| Ps{u), at m G (—1,1), validates a-posteriori 
the Fokker-Planck approximation, see SM, to the master equation @ nnun]. 



FIG. 2 : Left panel: The PDF Ps{u) for different values of k. For fc > 1 , a bimodal PDF appears, for fc = 1 the PDF is flat and, 
for fc < 1 , unimodal with a peak on it = 0 . Solid lines are given by Eq. (|m while markers are obtained by simulations [T6|, with 
a = 0 . 01 . Right panels: The MST as a function of a (upper right pan3) and fc (lower right panel) for g — 1. Each marker is 
obtained by averaging 200 numerical realizations m, whereas solid lines are given by Eq. 0 - 
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Equation ([^ also allows calculating the MST between the bimodal states [miiH]. In the bimodal regime, the MST 
T is the mean time it takes the system to go from a state rich in one transcription factor, say rt = 1, to a state rich 
in the other, u = —1, or vice versa. As shown in [n US], for A: > 1, the MST of Eq. (§ reads 

T^{k + 2)/{ga^), (7) 


where we have restored the original time units used in Q. This result (checked against simulations in Eig.[^ depends 
polynomially on the effective system size in contrast with the usually found exponential dependence of the mean 
escape time in bistable switches, see e.g. Refs. P^26) . Hence, the absence of CB allows for much more frequent 
switching between different phenotypic states, which can be beneficial, e.g., in cases of severe stress m- 

The previous results can be generalized to the case of different degradation rates, which can be analyzed using an 
adiabatic elimination of the w variable milHllig. A similar treatment can also be used to investigate the case of 
different repression strengths fci ^ ^ 2 . Yet, as can be checked, for £ ^ 1 the effect of uneven k’s on the PDE and 
MST is much weaker that the effect of uneven a’s. 

We again consider Eqs. ([^ assuming, without loss of generality, 02 < Oi, and denote ai = a and 02 = 6a, where 
5 G (0,1]. Defining u = (si — X2)/{xi + X2) and w = xi + X2, where Xi = ani and X2 = an2 are the concentrations, 
the stationary PDE, Qs{u), of finding concentration u, reads (see SM for details) 


Qs ('^) 
X exp 


= ZPs{u){l + u5 — u5) ^ “(i+i) 


1-5 

TT5 


2u + In 


1 — It 

1+u 


( 8 ) 


where Ps{u) is given by Eq. (§, and 2^ is a normalization factor such that Qs(u)du = 1. Our theory [Eq. (|^] 
excellently agrees with simulations, see Fig. 



FIG. 3: Left panel: Qs(u) [Eq. ([^] (solid lines) is compared for different values of 6 against simulations [16] (symbols). Here 
k = 5 and a = 0.01. Right panel: MST r versus 1/a, for k — 50 and g = 1. Each marker is obtained by averaging 200 
numerical realizations, while the solid lines are given by Eq. ([^ with A = 50 for S — 0.8 and A = 100 for S — 0.9. 

The PDF (|^ is a tilted version of PDF Q; indeed, the former reduces to the latter for 5 = 1. Since we have 
chosen 5 < 1, we find that the system resides most of the time at the metastable mode of u = — 1 and occasionally 
jumps to the transiently metastable mode of m = 1 (the opposite would occur for 5 > 1). Similarly as for the case of 
5 = 1, by decreasing k there exists a transition from a state rich in one type of transcription factor to a state where 
both types coexist, although not equally. Again, this is determined by a critical repressor strength kc, satisfying 
kc = 2/(1 + 5), see SM. For k > kc, both u = 1 and u = —1 are noise-induced metastable states, although the system 
is biased toward u = —1 as the degradation rate of the corresponding protein (of type B) is smaller. In contrast, as 
k is decreased below kc, the PDF flips, and peaks at u* = — 1 -|- 0(e), see SM. 

Since the MST r from u = —1 to u = 1 turns out to depend exponentially on the effective system size a~^ (see 
below), given Eq. r satisfies in the leading order r ~ (5s(—1)/inin[(5s(u)] [22l|30]. Here, the minimum of Qs{u) 
is obtained in the close vicinity of u = 1, satisfying — 1 — 2e{k/kc — 1)/(1 — 5) ~ 1. As Qs{u) diverges at u = —1, 
we thus compute the limit lima->.o Qs(—1 + o.)/Qs(um) and find, in the leading order of £ 1 

2 , 1' 
a(l + 5) 


A 

T ~ — exp 
ga 


( 9 ) 
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Here A = A(k, S) is an unknown prefactor, and we have restored the physical time units. Equation agrees well 
with simulations, see Fig. and in contrast to Eq. Q, depends exponentially on the effective system size. 

Finally, we can use the analysis above for other models that exhibit noise-induced bimodality such as the general 
toggle switch, described by Eqs. (H with 

fi(ni,n2) = [1 +, i^j = 1,2 (10) 

where the Hill coefficient is ft. = 1 El- In principle, the analysis can be done in the same manner as for the ESM. 
Yet, the task is slightly more difficult since the Langevin equation for w = xi + X 2 does not yield a Gaussian PDF for 
i?s(ui), which makes the equation for u less tractable. Nonetheless, we have numerically found the onset of bimodality 
to be at ft > ftc = 1 and that the MST behaves similarly to the ESM, see Fig. In sharp contrast, the genetic toggle 
switch model with CB, for which fi{ni,n2) are given by Eq. ( [To| ) with Hill coefficient ft > 2, displays (at least) two 
stable fixed points. In this case there is no threshold for bimodality when e <C 1, and one expects an exponential 
dependence of the MST on the system’s size m- In Fig. 1^ we compare the MSTs and PDFs of several models with 
and without CB. Our simulations indicate that the MST in the case of CB with ft > 2 yield a stretched-exponential 
dependence of the MST on the system’s size. This is a nontrivial result and requires a further study. While this is 
beyond the scope of this paper, we believe the formalism we have developed can be used to study toggle switch models 
with CB as well, as long as we are in the strong repression limit. 
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FIG. 4: (Top) MSTs for five different models: ESM, general toggle switch (TS) without (w/o) CB, and TS with (w) CB with 
ft = 2, 2.5, 3, for k = 1.5. Each point is obtained by averaging 200 realizations. (Bottom) PDFs of the difference and sum of 
the copy numbers ni and n 2 , for fc = 5 and a = 0.04. While P{n\ -|-n 2 ) almost coincides for all models, the “potential barrier” 
for switching given by max[P(ni — n 2 )] — min[P(ni — n 2 )], is much shallower for models without CB. 


We have presented an analytical treatment of the ESM demonstrating a bimodal behavior in the absence of two 
stable fixed points at the deterministic level. Bimodality is induced by multiplicative noise: the noise strength vanishes 
at the bimodal states whereas it is maximal at the single stable fixed point. This phenomenon, which has attracted 
much interest in various fields [niiMss], is linked here to previous numerical mui] and experimental [5] findings 
on the genetic toggle switch. 

We have shown that bimodal behavior ceases to occur if the noise strength in the system, controlled by the repression 
strength k, is reduced below a critical threshold. This transition, absent in bistable systems, is similar to that found 
in other noise-induced bimodal systems [niisiisz]- Moreover, we have shown here that the MST between bimodal 
states exhibits a polynomial, rather than exponential, scaling on the system size. In genetic toggle switches, the noise 
is controlled by the repression strength k, suggesting that bimodality can be achieved or lost by biological fine tuning 
of reaction rates. 

We would like to thank Ofer Biham for valuable discussions. This work was supported by Grant No. 300/14 of the 
Israel Science Foundation. T.B. acknowledges partial support from the National Aeronautics and Space Administra¬ 
tion through the NASA Astrobiology Institute under Cooperative Agreement No. NNA13AA91A issued through the 
Science Mission Directorate. 
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The exclusive switch model in the case of equal degradation rates 

Model definition and deterministic dynamics 


Our starting point here are rate equations (2) in the main text, obtained in the limit of fast binding/unbinding 
compared to other processes in the circuit [14] . see main text. These equations describe the dynamics of the mean 
number of A and B proteins, denoted by rii and n 2 , respectively, and read 


hi 


1 + kni 
1 + kni + kn2 


— aini. 


h2 


1 + kn2 
1 + kni + kn2 


— a2n2. 


(SI) 


Here, we have denoted the repression strength by A; = kq/ki, which is the ratio between the binding rate kq and 
unbinding rate Ki to/from the promoter. In addition, we have taken gji = gs = g, rescaled time t —>■ gt, and denoted 
the rescaled degradation rates ai = dA/g and a 2 = ds/g, see main text for the definition of parameters. 

System (SI I admits a unique positive stable fixed point. Assuming ai = a 2 = a (we will relax this assumption 
later on), we find 


n, = n-i = 


k — a \/+ kfi 6ak 
ika 


(S2) 


From now on we will consider the strong repression limit, e = a/k 1, meaning that degradation is assumed to 
occur much slower than inhibition. In this limit, the fixed point ( |S2[ ) simplifies to — l/(2a). As a result, 

in the strong repression limit we can naturally define an effective system size as a~^. By additionally defining the 


protein concentrations Xi = ant, i = 1,2, the rate equations (SI I become 


Xi = 


xi + e 

Xi + X2 + £ 


Xi, 


X2 = 


X 2 + e 

Xi X2 e 


X2, 


(S3) 


where we have further rescaled time by t —> at. Equations (S3) are governed by a single parameter, e, although a 


still appears in the definition of the concentrations, a fact that we shall use to carry out the expansion of the master 
equation. 


Relationship between model parameters and physical system size and biological relevance of parameter values 


As discussed in the main text, system (SI) is obtained by applying an adiabatic approximation of the following 
system of ODEs: 

fii = gAil-rB)-dAni-Koni{l-rA-rB) + KirA 
= 5b(1 - £a) - dBn2 - ko«2(1 - - xb) + kUb 

r-A = Konffl-TA-rB) - KiTa 

r-B = Kon2{l-rA-rB) - kiTb, (S4) 

where gA = Qb = 9 are the maximal production rates of proteins A and B, dA = dB = d are the corresponding 
degradation rates, and ta and tb the copy numbers of bound repressors. The constant kq denotes the binding rate 
of proteins to the promoter while ki is the dissociation rate. Recall also that these parameters are related to those of 


system (SI) via k = kq/ki and a = djg. 


Let us denote the physical system size by N. Changing the physical system size signifies using a larger reservoir 
with a larger number of molecules, but without changing the rules between the single interacting molecules. Since 


some of the terms in system (S4) depend on the number of molecules in play (and some others do not), it is the 
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purpose of this section to identify how the parameters in system (S41 scale with respect to N. Clearly, we expect 
according to the definition of the system size that rii ^ N and also, that the degradation rates cIa and ds do not 
depend on N. 

The adiabatic reduction assumes that the variables and evolve faster than rii and n 2 - This indicates that 


the terms on the right hand side of the third and fourth equations of system (S4) approximately balance each other, 
and we find e.g. 


Koni{l -rA- tb) ~ HiTa- 


(S5) 


As a result, since ta^tb we find that k = kq/ki 


N 


-1 


Moreover, close to the fixed point, one can 
follow a similar argument for the first and second equations of system (S4), and find that g ^ N~^, while a ^ N~^. 


Therefore, both the repression strength constant k and the effective degradation rate a are inversely proportional 


to the physical system size N. Yet, in our rescaled model, because of the scaling of the fixed point (S2) on a, it is 


legitimate to define the concentrations variables by Xi = ant (i = 1,2), rather than Xi = N~^ni. Note, that for a 
given N, one can increase/decrease the effective system size a~^ by changing the physical degradation rate d, while 
keeping k constant. 

We now discuss the biological relevance of the approximations used throughout the paper. We make use of three 
assumptions: (i) k is assumed to be on the order of one; (ii) the strong repression limit implies that e = fc/a <C 1; 
(iii) a is assumed to be small so that we can expand the master equation in a <C 1. 

Assumption (i) can always be obtained by rescaling time and the copy number variables. If assumption (i) is 
satisfied, then assumption (iii) follows from (ii). Therefore, we only need to justify assumption (ii) which is the strong 
repression limit £ <s; 1. As stated, this limit assumes that protein degradation occurs much slower than inhibition. 
Indeed, this assumption has often been invoked in previous theoretical works mui]. Moreover, in Ref. [S], the 
genetic switch model is compared to an experimental switch engineered using E. Coli. It is found that in order to 
have bimodality, the inhibition rate should be 15 times faster than the degradation rate in one species, and 150 times 
faster in the other (see caption of Fig. 5 in Ref. [8]). This number provides realistic values for Furthermore, a 
similar value of e = 0.04 appears in Ref. [TO] . 


Stochastic dynamics and Fokker-Planck/Langevin approximation 


The stochastic exclusive switch model can be obtained by viewing the terms of system (SI I as microscopic rates of 
production and degradation of transcription factors. Let us denote by and Ti^ the production rates, and by Tf 
and T 2 the degradation rates of proteins of type A and B, respectively. Given a state with ni and n 2 particles of 
type A and B respectively, these transition rates satisfy 


T+(ni,n2) = 
T2+(ni,n2) = 


1 + krii 


1 + kni + kn2 ’ 
1 + kn2 

1 + kni + kn2 ’ 


(ni,n2) = ani, 
T2 ini,n 2 ) = cm2. 


(S6) 


Using these, we can write down the master equation for the evolution of the probability density function (PDF) 


, (t) that the system is in state ( 711 , 712 ) at time t: 


711,712 


— T) (tIi 1, 77 . 2 ) 7^1 —l,n2 t tl2)Pn\^712 P ^2 (^1:^2 l)Fn^ ^^2 —1 ^2 (^11 ^2)Rni ,n2 

TT) (tIi -\- 1, 772)Fni +1,712 (^ 1 5 ^2)Pn\ ,712 ~b ^2 (^1)^2 4“ ^^Pni ,772 + 1 ^2 (^ 17 ^2')Pn \ ,712 ■ 


(S7) 


Equation (S7| with transition rates (S6) defines the stochastic model for the case of equal degradation rates, and can 


be simulated using the Gillespie algorithm 


Let us now approximate master equation (S71 into a Fokker-Planck equation. We do so by Taylor expanding 

Pi {"^1 ^i'^2)Pni^l,n2 —^1 (^1) ^2)7^71^ ^712 T [T) (^1! ^2)7^777 ^772] + — 9,77 [Tj (111,712)^777,772]) 


(S8) 


and same for . We now move to the concentration variables Xi = ant, i = 1,2, and use the fact that 3^ = ctdxi, 
which turns the Taylor expansion into a system size expansion. Employing this system size expansion on master 
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equation ( |S7| ), valid when the system size is large 1/a ^ 1, we arrive at the following Fokker-Planck equation for 
P{xi, X 2 ,t) - the probability to find concentrations xi and X 2 of A and B proteins, respectively, at time t: 


datPixi,X2,t) = 




* j'=i 


P{xi,X2A)- 


Here we have neglected 0(oA) terms or higher, time is measured in at units, and 




Xi+ e 
xi+ X2+ e 


Vi, and Bij = 


Xi + e 

X1+X2+ e 


“t“ Xi I Sii. 


Equation (S91 is equivalent to the system of Langevin equations defined in the Ito sense 

dxi 


d{at) 


= Ai + y/a\/^if]i{at), i = l,2, 


(S9) 


(SIO) 


(Sll) 


where r]i(t), i = 1,2, are independent normalized white Gaussian noises. 

As a final remark, note that system (Sll) is a better approximation to the master equation than the linear noise 
approximation obtained, e.g, with the van Kampen expansion |29j . In fact, system (Sill captures the multiplicative 
nature of the noise which, as we shall show, is the main driver of bimodality. The expansion technique that we have 
used is described in more detail in 1551. 


Simplifying the Langevin eguations 


We now introduce the new variables 


W = Xi + X2, U = 


Xi — X2 
Xi + X 2 ' 


(S12) 


and change variables in the Fokker-Planck equation ( |S9[ ) or, equivalently, in system (Sll) using the Ito formula 
(the latter is simpler). Doing so, and neglecting 0{y/e) terms in the noise coefficients, we arrive at 


dw 

d{at) 


—w^ — we + w -\- 2e 

W + £ 


+ \fa\J\ + w r]i{at), 


du 

d{at) 


= —2u£ 


w‘^{w + e) 


+ (I - 


I + w 




ri2{at). (SIS) 


Following the main text, we indicate by Vs{u,w) the stationary PDF of Eqs. ( S13[ ). Equations (SIS) need to be 
decoupled in order to be analytically tractable. Simulations (lower panel of Fig. 1 in the main text) indicate that the 
total concentration w fluctuates around the only stable fixed point (l — e + Vl + be + e^) /2 ~ 1 + £, a fact we wish 
to exploit for decoupling the two variables. We can see this effect by Taylor expanding in e <C 1 the deterministic 
part of the first of Eqs. (SIS), 


—w^ — we + w + 2£ 

W + £ 


}{w — w*), where w* = 1 + e. 


(S14) 


Hence, it is legitimate to linearize the first of Eqs. (SIS) around w* which, in the leading order of e <C 1, becomes 

dw 


d{at) 


= —{w — w*) + \/^r]i{at). 


Therefore, the stationary PDF for finding concentration w reads 


Rsiw) = 


1 


V2 


exp - 


7ra 


[■u; - (1 + e)]" 


(S15) 


(S16) 


This indicates that for e ^ I, the total number of A’s and H’s, represented by w, is approximately conserved. 

We now approximate the second of Eqs. (SIS I by setting w equal to w*. We further simplify this equation by noting 
that 


w — a 
w*^{w* + e) 


I, and 


w* + 1 




(S17) 
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and by rescaling time by 2e. As a result, the Langevin equation for u reads 


^ = —u + Vk\/1 — V? r]2{i), 
at 


(S18) 


where i = 2£at = {2a^/k)t, and t is the time used in Eq. (S7). 
already rescaled time t 


Note that in order to reach Eq. we have 


gt^ so the physical time units contain an additional 1/g factor, see main text. We denote 
the stationary PDF of Eq. (S18) by Ps{u). We have thus established that Vs{u,w) = Ps{u)Rs{w). Eq. (S18) and 


its stationary solution are further discussed in the main text, where we find that the mean switching time scales 
polynomially with a~^. Yet, verifying this result experimentally is expected to be highly nontrivial, as trapping of 
single cells over many cell cycles is required (e.g. by using a microfluidic device) to carry our such measurements. 


The exclusive switch model in the case of different degradation rates 


Model definition and deterministic dynamics 


The exclusive switch model (SI) can be generalized to the case of different degradation rates: 

1 + kni 


ni = 


1 + kni + kn2 


— Oirii, 712 — 


1 + kn2 
1 + kni + kn2 


— a2n2. 


(S19) 


We assume, without loss of generality, 0:2 < oi, and denote ai = a and 02 = Sa, where 6 G (0,1]. The concentrations 
of A and B are given respectively by Xi = ani and X 2 = 0 : 712 ■ Rescaling time t ^ at and using e = a/ k, we arrive 
at the following rescaled rate equations 


Xi+ £ . X2+ £ „ 

Xi = - -- — Xi, X2 = - -- - -- — 0X2- 

Xi+X2+£ Xi+X2+£ 


(S20) 


The corresponding stochastic model is described by a master equation given by Eq. (S7) with rates (S6) where here 
T 2 ~( 711 , 712 ) = San 2 - Note that in this case we keep our original system size definition as a~^. 


Derivation of the Langevin equations and analysis 


Expanding master equation (S7) in a similar way to the previous section, we arrive at the following system of 
Langevin equations: 


dx] 


Xi +£ 


d(at) xi + X 2 + £ 
dX2 _ X2+ £ 

d{at) xi + X 2 + £ 



Xi + X 2 + £ 


xi 771 (of), 

+ Sx2 772 (of). 


(S21) 


These equations reduce to system (Sll) for i5 = 1. 

We now change variables in system (S21) using the Ito formula m, from (a:i,a; 2 ) to {w,u) defined in Eqs. ( S12| |. 
We obtain 


d{at) 

where the deterministic part for w reads 


=Aw + Giir]i{ctt)+Gi2r]i(.ctt), = Au + Q2i'ni{o‘t) + G22V2iat), 


d{at) 


Aw = X I duw — uw — 6w — 


- 7U + 4 , 


2w 

W + £ 

and the deterministic part for u, written as a term independent of 5 plus a correction, is 

(1 — 'uP‘{a + Tc) — v?"w£{a + 77 ;)+ w£{a + w) 


An = -2 u£ ^ . + ((5 - 1)- 


w^{w + e) 


2w^{w + e) 


(S22) 


(S23) 


(S24) 
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In addition, the expression for the noise matrix neglecting 0{^/e) terms, reads 


g = 


'Ja / ^w^(u + l)(w + 1) 


(S25) 


7 a /2 V (1 - m )\/(1 + u)(w + 1 ) -{1 + m )\/(1 - ' u ){ w 8 + 1 ) 

As a hnal remark, note that instead of matrix Q one can just reabsorb the noise coefficients by dehning new noise 
variables, and ^ 2 , so that Eqs. (S22) become 


dw 

d{at) 


— Au 


'6 (at), 


du 

d{at) 


= Au + ^2{at). 


(S26) 


Here, the correlator of the two noise variables reads = Bij{u,w)6{t — t'), with B = This is indeed 

the result that one arrives at by applying the change of variable (S12) to the Fokker-Planck equation rather that using 
the Its formula on the corresponding Langevin system. Yet, we have pursued the latter approach since it is simpler. 
In order to decouple Eqs. (S22), we compute the single u-dependent positive fixed point of the w equation, w*{u). 

3 arrive 

w*{u) 


Taylor expanding this fixed point in e <§; 1, we arrive at the expression: 

2 


S — Su + u + 1 


(S27) 


We now employ an adiabatic elimination [39j of w by substituting w = w*{u) into the u equation. Moreover, we sum 
the two noise terms using the summation rule for Gaussian variables [19j 

02 i??i(at) + C/ 22 ?? 2 (at) = \jg 21 + = y/B^r]{at), where Bun = a (l - | ^ ~^2w*{u)'^ ^ | , 

(S28) 

and r](t) is normalized white Gaussian noise. Substituting Eq. (S27) into the expression for B^u, we arrive at the 
simple formula 


Buu ~ ' 2 ^^ ~ + 5)[<5 + (1 — S)u + 1]. 


(S29) 


We now simplify the deterministic part of the u-equation, Eq. (S24), by neglecting small terms. The first term of 
Eq. (S24) is simplihed according to Eq. (SI7), as done in the previous section. The second (5-dependent term is 
approximated by taking the leading order result with respect to a <C 1 and e <C I. As a result, we arrive at the hnal 
Langevin equation for u: 


= -((5 — I) (1 — u^) — -ue{5 — Su + u + 1)'^ + (1 ~ -I- 5)[(5 -I- (1 — 6)u + 1] r]{at). (S30) 

This equation describes the model for the case of different degradation rates and its stationary solution is discussed 


du 

d{at) 


in the main text. The stationary PDF, Qs(u), for hnding concentration u can be found from Eq. (S30) and reads 

1 — u 


Qsiu) = Z [1 - u'^) (1-I-u-I-(5 - uJ) ^ °(i+i) exp ( j -|- j 


2u + In 


I -I- u 


(S31) 


where Z is a normalization factor such that Qs(u)du = I. 


Investigation of the tilted PDF in the case of different degradation rates 


PDF (S3I) is a tilted version of the PDF in the equal a case, see Eq. (6) in the main text. Indeed, the former 
reduces to the latter for 5 = 1. To remind the reader, without loss of generality we assume 5 < 1. When k > kc 
(where kc is the critical repression strength below which bimodality is lost, see below), both u = 1 and u = — 1 are 
noise-induced metastable states. For 5 < 1, we hud that the system resides most of the time in the metastable mode 
of It = —1 and occasionally jumps to the transiently metastable mode of u = 1 (the opposite would occur for 5 > 1). 
This is because the degradation rate of protein B is smaller in this case. In contrast, as k is decreased below kc, first 
the mode at u = 1 and then the mode at u = —1 are lost, and eventually the PDF flips, and becomes a unimodal 
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PDF with a peak at u* which is the mode of Eq. (S311. Differentiating the logarithm of PDF (S31) once and equating 
to zero, we find the PDF mode at k < kc to be at 


u 


* 


6{6 + l)a 
kil-6) 



(S32) 


In order to find kc in the case of different degradation rates we evaluate k at which Qsiu) ceases to be bimodal. 
As k is decreased, the mode at m = 1 disappears before the mode at w = — 1 does. As a result, kc can be found by 
checking when the concavity at u = 1 of Qs{u) changes sign. Alternatively, since for k > kc, Q's(u 1) —)■ oo, while 
for k < kc, Q'siu 1) —>■ —oo, we can find kc by demanding that at k = kc, the first derivative at m = 1 does not 
diverge. Using Eq. ( |S31| ), we differentiate the logarithm of Qs{u) once and evaluate the result in the vicinity of u = 1. 
In the limit of a ^ 1, the result is 


dlnQsju) ^ Q',{u) ^ 2/[k{l + S)] - 1 1 

du Qs{u) u — 1 a((5 +1) 


(S33) 


As stated, at k = kc the first (diverging) term has to vanish. Therefore, we find 


kc = 


2 

TTs' 


(S34) 


This result consistently reduces to Ac = 1, at 5 —>■ 1. Note, that the value of k at which the PDF flips at m = — 1 is 
lower than kc and is obtained at k = 26/{I + S) < kc- Only for values of k lower than this value, the PDF actually 
becomes unimodal, and the mode is given by Eq. (S32), which is indeed valid as long as u* > —1 or A < 25/{\ + <5). 
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